Consonant confusion structure based on machine classification of visual features in continuous speech

Authors

Jianxia Xue

Jintao Jiang

Abeer Alwan

Lynne E. Bernstein

Abstract

This study is a first step toward selecting an appropriate subword unit representation for synthesizing highly intelligible 3D talking faces. Consonant confusions were obtained from optic features in a 320-sentence database, spoken by a male talker, using Gaussian mixture models and maximum a posteriori classification. The results were compared with consonant confusions obtained from visual-only human perception tests of nonsense CV syllables spoken by the same talker. At the phoneme level, machine classification of the continuous speech database performed worse than human perception of isolated syllables. However, the number of consonant clusters distinguishable by machine equals the number distinguishable by humans. For modeling optic features in continuous visual speech synthesis, the results suggest that, for most consonants, modeling at the phoneme level is more appropriate than modeling at the level of phoneme clusters determined from visual-only human perception tests. For some consonants, context-dependent modeling might improve modeling accuracy for the talker studied in this paper.
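The classification scheme named in the abstract (one Gaussian mixture model per consonant, a maximum a posteriori decision, and a confusion matrix tallied over labeled samples) can be illustrated with a minimal sketch. This is not the authors' implementation: the two-dimensional feature vectors, the mixture parameters, the equal priors, the three consonant labels, and all function names below are hypothetical values chosen only to make the example run.

```python
import math

def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at x."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def log_gmm(x, components):
    """Log density of a mixture; components are (weight, mean, var) tuples."""
    terms = [math.log(w) + log_gauss(x, m, v) for w, m, v in components]
    top = max(terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))

def map_classify(x, models, priors):
    """Return the class with the highest log prior plus log likelihood."""
    return max(models, key=lambda c: math.log(priors[c]) + log_gmm(x, models[c]))

def confusion_matrix(samples, models, priors):
    """Count (true label, predicted label) pairs over labeled samples."""
    labels = sorted(models)
    counts = {t: {p: 0 for p in labels} for t in labels}
    for true_label, x in samples:
        counts[true_label][map_classify(x, models, priors)] += 1
    return counts

# Hypothetical per-consonant mixtures over a made-up 2-D optic feature space.
models = {
    "p": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 0.0], [1.0, 1.0])],
    "f": [(1.0, [5.0, 5.0], [1.0, 1.0])],
    "t": [(1.0, [-5.0, 5.0], [1.0, 1.0])],
}
priors = {label: 1.0 / len(models) for label in models}  # assumed equal priors

# Hypothetical labeled test samples; the last "p" token lies near the "f" model.
samples = [
    ("p", [0.2, 0.1]),
    ("f", [4.8, 5.3]),
    ("t", [-5.1, 4.7]),
    ("p", [4.0, 4.0]),
]

cm = confusion_matrix(samples, models, priors)
for true_label in sorted(cm):
    print(true_label, cm[true_label])
```

Off-diagonal counts in such a matrix are the confusions; grouping consonants that are frequently confused with one another is one way to arrive at clusters of the kind the abstract compares between machine and human results.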


Similar resources

Improving of Feature Selection in Speech Emotion Recognition Based-on Hybrid Evolutionary Algorithms

One of the important issues in speech emotion recognition is selecting appropriate feature sets in order to improve the detection rate and classification accuracy. In previous studies, researchers tried to select appropriate features for classification using feature selection and feature-space reduction methods, such as Fisher and PCA. In this research, a hybrid evolutionary algorit...


Classification of emotional speech using spectral pattern features

Speech Emotion Recognition (SER) is a new and challenging research area with a wide range of applications in man-machine interactions. The aim of a SER system is to recognize human emotion by analyzing the acoustics of speech sound. In this study, we propose Spectral Pattern features (SPs) and Harmonic Energy features (HEs) for emotion recognition. These features extracted from the spectrogram ...


P65: Speech Recognition Based on Bbrain Signals by the Quantum Support Vector Machine for Inflammatory Patient ALS

People communicate with each other by exchanging verbal and visual expressions. However, paralyzed patients with various neurological diseases such as amyotrophic lateral sclerosis and cerebral ischemia have difficulties in daily communications because they cannot control their body voluntarily. In this context, brain-computer interface (BCI) has been studied as a tool of communication for thes...


Speech Emotion Recognition Using Scalogram Based Deep Structure

Speech Emotion Recognition (SER) is an important part of speech-based Human-Computer Interface (HCI) applications. Previous SER methods rely on the extraction of features and training an appropriate classifier. However, most of those features can be affected by emotionally irrelevant factors such as gender, speaking styles and environment. Here, an SER method has been proposed based on a concat...


Consonant Classification using Decision Directed Acyclic Graph Support Vector Machine Algorithm

This paper presents a statistical learning algorithm based on Support Vector Machines (SVMs) for the classification of Malayalam Consonant – Vowel (CV) speech unit in noisy environments. We extend SVM for multiclass classification using Decision Directed Acyclic Graph Support Vector Machine (DDAGSVM) algorithm. For classification, acoustical features are extracted using Wavelet Transform (WT) b...




Publication date: 2005
